Dataset statistics
| Number of variables | 41 |
|---|---|
| Number of observations | 59400 |
| Missing cells | 46094 |
| Missing cells (%) | 1.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 19.0 MiB |
| Average record size in memory | 336.0 B |
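The missing-cell percentage follows from the other figures in this table. A quick check, using only the numbers reported above:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 41
n_observations = 59400
missing_cells = 46094

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```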
Variable types
| CAT | 29 |
|---|---|
| NUM | 10 |
| BOOL | 2 |
recorded_by has constant value "GeoData Consultants Ltd" | Constant |
date_recorded has a high cardinality: 356 distinct values | High cardinality |
funder has a high cardinality: 1897 distinct values | High cardinality |
installer has a high cardinality: 2145 distinct values | High cardinality |
wpt_name has a high cardinality: 37400 distinct values | High cardinality |
subvillage has a high cardinality: 19287 distinct values | High cardinality |
lga has a high cardinality: 125 distinct values | High cardinality |
ward has a high cardinality: 2092 distinct values | High cardinality |
scheme_name has a high cardinality: 2696 distinct values | High cardinality |
extraction_type_group is highly correlated with extraction_type and 1 other fields | High correlation |
extraction_type is highly correlated with extraction_type_group and 1 other fields | High correlation |
extraction_type_class is highly correlated with extraction_type and 1 other fields | High correlation |
management_group is highly correlated with management | High correlation |
management is highly correlated with management_group | High correlation |
payment_type is highly correlated with payment | High correlation |
payment is highly correlated with payment_type | High correlation |
quality_group is highly correlated with water_quality | High correlation |
water_quality is highly correlated with quality_group | High correlation |
quantity_group is highly correlated with quantity | High correlation |
quantity is highly correlated with quantity_group | High correlation |
source_type is highly correlated with source and 1 other fields | High correlation |
source is highly correlated with source_type and 1 other fields | High correlation |
source_class is highly correlated with source and 1 other fields | High correlation |
waterpoint_type_group is highly correlated with waterpoint_type | High correlation |
waterpoint_type is highly correlated with waterpoint_type_group | High correlation |
funder has 3635 (6.1%) missing values | Missing |
installer has 3655 (6.2%) missing values | Missing |
public_meeting has 3334 (5.6%) missing values | Missing |
scheme_management has 3877 (6.5%) missing values | Missing |
scheme_name has 28166 (47.4%) missing values | Missing |
permit has 3056 (5.1%) missing values | Missing |
amount_tsh is highly skewed (γ1 = 57.80779995) | Skewed |
num_private is highly skewed (γ1 = 91.93374999) | Skewed |
id has unique values | Unique |
amount_tsh has 41639 (70.1%) zeros | Zeros |
gps_height has 20438 (34.4%) zeros | Zeros |
longitude has 1812 (3.1%) zeros | Zeros |
num_private has 58643 (98.7%) zeros | Zeros |
population has 21381 (36.0%) zeros | Zeros |
construction_year has 20709 (34.9%) zeros | Zeros |
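The percentages in the "Zeros" alerts can be recomputed from the raw counts and the 59400 observations. The sketch below does only that; whether a zero is a genuine measurement or a placeholder for "unknown" is not something the report states, so it has to be judged per variable.

```python
N_OBSERVATIONS = 59400

# Zero counts copied from the "Zeros" alerts above.
zero_counts = {
    "amount_tsh": 41639,
    "gps_height": 20438,
    "longitude": 1812,
    "num_private": 58643,
    "population": 21381,
    "construction_year": 20709,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the percentage of rows whose value is zero."""
    return 100 * count / total


for name, count in zero_counts.items():
    print(f"{name}: {zero_share(count):.1f}% zeros")
```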
Reproduction
| Analysis started | 2021-09-13 07:17:45.337923 |
|---|---|
| Analysis finished | 2021-09-13 07:18:12.236491 |
| Duration | 26.9 seconds |
| Software version | pandas-profiling v2.9.0 |
id
Real number (ℝ≥0)
| Distinct | 59400 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37115.13177 |
| Minimum | 0 |
| Maximum | 74247 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3730.9 |
| Q1 | 18519.75 |
| median | 37061.5 |
| Q3 | 55656.5 |
| 95-th percentile | 70564.05 |
| Maximum | 74247 |
| Range | 74247 |
| Interquartile range (IQR) | 37136.75 |
Descriptive statistics
| Standard deviation | 21453.12837 |
|---|---|
| Coefficient of variation (CV) | 0.5780156866 |
| Kurtosis | -1.201515029 |
| Mean | 37115.13177 |
| Median Absolute Deviation (MAD) | 18568.5 |
| Skewness | 0.00262253035 |
| Sum | 2204638827 |
| Variance | 460236716.9 |
| Monotonicity | Not monotonic |
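Several of the statistics above are derived from one another. A short check, using the figures reported for this variable (the one with 59400 distinct values), shows the relationships; small differences in the last digits come from rounding in the report.

```python
# Figures copied from the quantile and descriptive tables above.
minimum, maximum = 0, 74247
q1, q3 = 18519.75, 55656.5
mean = 37115.13177
std = 21453.12837

value_range = maximum - minimum  # "Range"
iqr = q3 - q1                    # "Interquartile range (IQR)"
cv = std / mean                  # "Coefficient of variation (CV)"
variance = std ** 2              # "Variance"

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.1f}")
```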
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 72310 | 1 | < 0.1% | |
| 49805 | 1 | < 0.1% | |
| 51852 | 1 | < 0.1% | |
| 62091 | 1 | < 0.1% | |
| 64138 | 1 | < 0.1% | |
| 57993 | 1 | < 0.1% | |
| 60040 | 1 | < 0.1% | |
| 33413 | 1 | < 0.1% | |
| 35460 | 1 | < 0.1% | |
| Other values (59390) | 59390 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 74247 | 1 | < 0.1% | |
| 74246 | 1 | < 0.1% | |
| 74243 | 1 | < 0.1% | |
| 74242 | 1 | < 0.1% | |
| 74240 | 1 | < 0.1% |
amount_tsh
Real number (ℝ≥0)
| Distinct | 98 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 317.6503847 |
| Minimum | 0 |
| Maximum | 350000 |
| Zeros | 41639 |
| Zeros (%) | 70.1% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 20 |
| 95-th percentile | 1200 |
| Maximum | 350000 |
| Range | 350000 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 2997.574558 |
|---|---|
| Coefficient of variation (CV) | 9.436709989 |
| Kurtosis | 4903.543102 |
| Mean | 317.6503847 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 57.80779995 |
| Sum | 18868432.85 |
| Variance | 8985453.232 |
| Monotonicity | Not monotonic |
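The skewness of about 57.8 above is what triggers the "Skewed" alert for `amount_tsh`. As a rough illustration of why a mostly-zero column with a few very large values scores so high, the sketch below computes the population (biased) Fisher-Pearson skewness. This is an assumption about the general formula only: the report's own estimator may apply a bias correction, and the sample is a hypothetical toy list, not the real data.

```python
def skewness(values):
    """Population (biased) Fisher-Pearson skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy sample: mostly zeros plus one very large value.
sample = [0] * 7 + [20, 500, 350000]

print(skewness([1, 2, 3]))   # symmetric data has zero skewness
print(skewness(sample) > 0)  # a long right tail gives positive skewness
```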
| Value | Count | Frequency (%) | |
| 0 | 41639 | 70.1% | |
| 500 | 3102 | 5.2% | |
| 50 | 2472 | 4.2% | |
| 1000 | 1488 | 2.5% | |
| 20 | 1463 | 2.5% | |
| 200 | 1220 | 2.1% | |
| 100 | 816 | 1.4% | |
| 10 | 806 | 1.4% | |
| 30 | 743 | 1.3% | |
| 2000 | 704 | 1.2% | |
| Other values (88) | 4947 | 8.3% |
| Value | Count | Frequency (%) | |
| 0 | 41639 | 70.1% | |
| 0.2 | 3 | < 0.1% | |
| 0.25 | 1 | < 0.1% | |
| 1 | 3 | < 0.1% | |
| 2 | 13 | < 0.1% |
| Value | Count | Frequency (%) | |
| 350000 | 1 | < 0.1% | |
| 250000 | 1 | < 0.1% | |
| 200000 | 1 | < 0.1% | |
| 170000 | 1 | < 0.1% | |
| 138000 | 1 | < 0.1% |
date_recorded
Categorical
| Distinct | 356 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| 2011-03-15 | 572 | 1.0% | |
| 2011-03-17 | 558 | 0.9% | |
| 2013-02-03 | 546 | 0.9% | |
| 2011-03-14 | 520 | 0.9% | |
| 2011-03-16 | 513 | 0.9% | |
| 2011-03-18 | 497 | 0.8% | |
| 2011-03-19 | 466 | 0.8% | |
| 2013-02-04 | 464 | 0.8% | |
| 2013-01-29 | 459 | 0.8% | |
| 2011-03-04 | 458 | 0.8% | |
| Other values (346) | 54347 | 91.5% |
Unique
| Unique | 35 |
|---|---|
| Unique (%) | 0.1% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
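Every value of this variable is exactly 10 characters long, and the listed values follow the `YYYY-MM-DD` pattern, so it can probably be parsed as a date rather than kept as a high-cardinality category. A minimal sketch, assuming all 356 distinct values follow that format (the report only shows the most frequent ones):

```python
from collections import Counter
from datetime import date

# A few values copied from the frequency table above.
raw_dates = ["2011-03-15", "2011-03-17", "2013-02-03", "2011-03-14", "2013-01-29"]

# date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD,
# which makes unexpected formats easy to spot.
parsed = [date.fromisoformat(text) for text in raw_dates]
per_year = Counter(d.year for d in parsed)

print(per_year)
```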
funder
Categorical
| Distinct | 1897 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 3635 |
| Missing (%) | 6.1% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| Government Of Tanzania | 9084 | 15.3% | |
| Danida | 3114 | 5.2% | |
| Hesawa | 2202 | 3.7% | |
| Rwssp | 1374 | 2.3% | |
| World Bank | 1349 | 2.3% | |
| Kkkt | 1287 | 2.2% | |
| World Vision | 1246 | 2.1% | |
| Unicef | 1057 | 1.8% | |
| Tasaf | 877 | 1.5% | |
| District Council | 843 | 1.4% | |
| Other values (1887) | 33332 | 56.1% | |
| (Missing) | 3635 | 6.1% |
Unique
| Unique | 974 |
|---|---|
| Unique (%) | 1.7% |
Length
| Max length | 30 |
|---|---|
| Median length | 6 |
| Mean length | 9.505824916 |
| Min length | 1 |
gps_height
Real number (ℝ)
| Distinct | 2428 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 668.2972391 |
| Minimum | -90 |
| Maximum | 2770 |
| Zeros | 20438 |
| Zeros (%) | 34.4% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | -90 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 369 |
| Q3 | 1319.25 |
| 95-th percentile | 1797 |
| Maximum | 2770 |
| Range | 2860 |
| Interquartile range (IQR) | 1319.25 |
Descriptive statistics
| Standard deviation | 693.1163503 |
|---|---|
| Coefficient of variation (CV) | 1.037137833 |
| Kurtosis | -1.292440135 |
| Mean | 668.2972391 |
| Median Absolute Deviation (MAD) | 369 |
| Skewness | 0.462402085 |
| Sum | 39696856 |
| Variance | 480410.2751 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 20438 | 34.4% | |
| -15 | 60 | 0.1% | |
| -16 | 55 | 0.1% | |
| -13 | 55 | 0.1% | |
| -20 | 52 | 0.1% | |
| 1290 | 52 | 0.1% | |
| -14 | 51 | 0.1% | |
| 303 | 51 | 0.1% | |
| -18 | 49 | 0.1% | |
| -19 | 47 | 0.1% | |
| Other values (2418) | 38490 | 64.8% |
| Value | Count | Frequency (%) | |
| -90 | 1 | < 0.1% | |
| -63 | 2 | < 0.1% | |
| -59 | 1 | < 0.1% | |
| -57 | 1 | < 0.1% | |
| -55 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2770 | 1 | < 0.1% | |
| 2628 | 1 | < 0.1% | |
| 2627 | 1 | < 0.1% | |
| 2626 | 2 | < 0.1% | |
| 2623 | 1 | < 0.1% |
installer
Categorical
| Distinct | 2145 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 3655 |
| Missing (%) | 6.2% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| DWE | 17402 | 29.3% | |
| Government | 1825 | 3.1% | |
| RWE | 1206 | 2.0% | |
| Commu | 1060 | 1.8% | |
| DANIDA | 1050 | 1.8% | |
| KKKT | 898 | 1.5% | |
| Hesawa | 840 | 1.4% | |
| 0 | 777 | 1.3% | |
| TCRS | 707 | 1.2% | |
| Central government | 622 | 1.0% | |
| Other values (2135) | 29358 | 49.4% | |
| (Missing) | 3655 | 6.2% |
Unique
| Unique | 1098 |
|---|---|
| Unique (%) | 2.0% |
Length
| Max length | 30 |
|---|---|
| Median length | 4 |
| Mean length | 5.91976431 |
| Min length | 1 |
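The frequency table for this variable contains the value `0` (777 rows) alongside real names, and labels in mixed case such as `DWE`, `Government` and `Central government`. One possible clean-up, sketched below, trims and case-folds labels and treats `0` as missing. Both choices are assumptions, not something the report prescribes, and the mixed-case variants in the sample are hypothetical.

```python
from collections import Counter

# Assumption: "0" is a placeholder rather than a real installer name.
PLACEHOLDERS = {"", "0"}


def normalise_label(label):
    """Trim and case-fold a label; return None for missing or placeholder values."""
    if label is None:
        return None
    cleaned = label.strip().casefold()
    return None if cleaned in PLACEHOLDERS else cleaned


# Hypothetical raw values; the case and whitespace variants are illustrative only.
raw = ["DWE", "dwe", "DANIDA", "Danida ", "0", None, "Government"]
counts = Counter(normalise_label(value) for value in raw)

print(counts)
```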
longitude
Real number (ℝ≥0)
| Distinct | 57516 |
|---|---|
| Distinct (%) | 96.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.07742669 |
| Minimum | 0 |
| Maximum | 40.34519307 |
| Zeros | 1812 |
| Zeros (%) | 3.1% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 30.04066001 |
| Q1 | 33.09034738 |
| median | 34.90874343 |
| Q3 | 37.17838657 |
| 95-th percentile | 39.13323954 |
| Maximum | 40.34519307 |
| Range | 40.34519307 |
| Interquartile range (IQR) | 4.08803919 |
Descriptive statistics
| Standard deviation | 6.567431846 |
|---|---|
| Coefficient of variation (CV) | 0.1927208854 |
| Kurtosis | 19.18703105 |
| Mean | 34.07742669 |
| Median Absolute Deviation (MAD) | 2.032511095 |
| Skewness | -4.191046455 |
| Sum | 2024199.146 |
| Variance | 43.13116105 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 1812 | 3.1% | |
| 37.54090064 | 2 | < 0.1% | |
| 33.01050977 | 2 | < 0.1% | |
| 39.09348389 | 2 | < 0.1% | |
| 32.9727187 | 2 | < 0.1% | |
| 33.00627548 | 2 | < 0.1% | |
| 39.10395018 | 2 | < 0.1% | |
| 37.54278497 | 2 | < 0.1% | |
| 36.80248988 | 2 | < 0.1% | |
| 39.09837398 | 2 | < 0.1% | |
| Other values (57506) | 57570 | 96.9% |
| Value | Count | Frequency (%) | |
| 0 | 1812 | 3.1% | |
| 29.6071219 | 1 | < 0.1% | |
| 29.60720109 | 1 | < 0.1% | |
| 29.61032056 | 1 | < 0.1% | |
| 29.61096482 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 40.34519307 | 1 | < 0.1% | |
| 40.34430089 | 1 | < 0.1% | |
| 40.32523996 | 1 | < 0.1% | |
| 40.32522643 | 1 | < 0.1% | |
| 40.32340181 | 1 | < 0.1% |
latitude
Real number (ℝ)
| Distinct | 57517 |
|---|---|
| Distinct (%) | 96.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.70603266 |
| Minimum | -11.64944018 |
| Maximum | -2e-08 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | -11.64944018 |
|---|---|
| 5-th percentile | -10.58554992 |
| Q1 | -8.540621305 |
| median | -5.02159665 |
| Q3 | -3.32615564 |
| 95-th percentile | -1.408872227 |
| Maximum | -2e-08 |
| Range | 11.64944016 |
| Interquartile range (IQR) | 5.214465665 |
Descriptive statistics
| Standard deviation | 2.946019081 |
|---|---|
| Coefficient of variation (CV) | -0.5162990219 |
| Kurtosis | -1.057616666 |
| Mean | -5.70603266 |
| Median Absolute Deviation (MAD) | 2.07002988 |
| Skewness | -0.1520365709 |
| Sum | -338938.34 |
| Variance | 8.679028427 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| -2e-08 | 1812 | 3.1% | |
| -6.98584173 | 2 | < 0.1% | |
| -3.79757861 | 2 | < 0.1% | |
| -6.98188419 | 2 | < 0.1% | |
| -7.10462503 | 2 | < 0.1% | |
| -7.05692253 | 2 | < 0.1% | |
| -7.17517443 | 2 | < 0.1% | |
| -6.99073094 | 2 | < 0.1% | |
| -6.9787555 | 2 | < 0.1% | |
| -6.99470401 | 2 | < 0.1% | |
| Other values (57507) | 57570 | 96.9% |
| Value | Count | Frequency (%) | |
| -11.64944018 | 1 | < 0.1% | |
| -11.64837759 | 1 | < 0.1% | |
| -11.58629656 | 1 | < 0.1% | |
| -11.56857679 | 1 | < 0.1% | |
| -11.56680457 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| -2e-08 | 1812 | 3.1% | |
| -0.99846435 | 1 | < 0.1% | |
| -0.998916 | 1 | < 0.1% | |
| -0.99901209 | 1 | < 0.1% | |
| -0.99911702 | 1 | < 0.1% |
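The latitude -2e-08 occurs 1812 times, the same count the alerts give for zero longitudes, while the next latitude closest to zero is about -0.998. That is consistent with placeholder coordinates near (0, 0), although the report does not confirm that the same rows are involved. A minimal sketch of a check for such rows, with a hypothetical helper name and tolerance:

```python
def is_placeholder_coordinate(longitude, latitude, tol=1e-6):
    """True when both coordinates sit (almost) exactly at the origin."""
    return abs(longitude) <= tol and abs(latitude) <= tol


# First pair: the suspicious values from the tables above.
# Second pair: the reported medians, used only as a plausible location
# (it is not an actual row from the data).
print(is_placeholder_coordinate(0.0, -2e-08))
print(is_placeholder_coordinate(34.90874343, -5.02159665))
```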
wpt_name
Categorical
| Distinct | 37400 |
|---|---|
| Distinct (%) | 63.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| none | 3563 | 6.0% | |
| Shuleni | 1748 | 2.9% | |
| Zahanati | 830 | 1.4% | |
| Msikitini | 535 | 0.9% | |
| Kanisani | 323 | 0.5% | |
| Bombani | 271 | 0.5% | |
| Sokoni | 260 | 0.4% | |
| Ofisini | 254 | 0.4% | |
| School | 208 | 0.4% | |
| Shule Ya Msingi | 199 | 0.3% | |
| Other values (37390) | 51209 | 86.2% |
Unique
| Unique | 32928 |
|---|---|
| Unique (%) | 55.4% |
Length
| Max length | 30 |
|---|---|
| Median length | 10 |
| Mean length | 10.96210438 |
| Min length | 1 |
num_private
Real number (ℝ≥0)
| Distinct | 65 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4741414141 |
| Minimum | 0 |
| Maximum | 1776 |
| Zeros | 58643 |
| Zeros (%) | 98.7% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 1776 |
| Range | 1776 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 12.23622981 |
|---|---|
| Coefficient of variation (CV) | 25.80713147 |
| Kurtosis | 11137.29521 |
| Mean | 0.4741414141 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 91.93374999 |
| Sum | 28164 |
| Variance | 149.72532 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 58643 | 98.7% | |
| 6 | 81 | 0.1% | |
| 1 | 73 | 0.1% | |
| 5 | 46 | 0.1% | |
| 8 | 46 | 0.1% | |
| 32 | 40 | 0.1% | |
| 45 | 36 | 0.1% | |
| 15 | 35 | 0.1% | |
| 39 | 30 | 0.1% | |
| 93 | 28 | < 0.1% | |
| Other values (55) | 342 | 0.6% |
| Value | Count | Frequency (%) | |
| 0 | 58643 | 98.7% | |
| 1 | 73 | 0.1% | |
| 2 | 23 | < 0.1% | |
| 3 | 27 | < 0.1% | |
| 4 | 20 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1776 | 1 | < 0.1% | |
| 1402 | 1 | < 0.1% | |
| 755 | 1 | < 0.1% | |
| 698 | 1 | < 0.1% | |
| 672 | 1 | < 0.1% |
basin
Categorical
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| Lake Victoria | 10248 | 17.3% | |
| Pangani | 8940 | 15.1% | |
| Rufiji | 7976 | 13.4% | |
| Internal | 7785 | 13.1% | |
| Lake Tanganyika | 6432 | 10.8% | |
| Wami / Ruvu | 5987 | 10.1% | |
| Lake Nyasa | 5085 | 8.6% | |
| Ruvuma / Southern Coast | 4493 | 7.6% | |
| Lake Rukwa | 2454 | 4.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 10.8923569 |
| Min length | 6 |
subvillage
Categorical
| Distinct | 19287 |
|---|---|
| Distinct (%) | 32.7% |
| Missing | 371 |
| Missing (%) | 0.6% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| Madukani | 508 | 0.9% | |
| Shuleni | 506 | 0.9% | |
| Majengo | 502 | 0.8% | |
| Kati | 373 | 0.6% | |
| Mtakuja | 262 | 0.4% | |
| Sokoni | 232 | 0.4% | |
| M | 187 | 0.3% | |
| Muungano | 172 | 0.3% | |
| Mbuyuni | 164 | 0.3% | |
| Mlimani | 152 | 0.3% | |
| Other values (19277) | 55971 | 94.2% | |
| (Missing) | 371 | 0.6% |
Unique
| Unique | 9424 |
|---|---|
| Unique (%) | 16.0% |
Length
| Max length | 30 |
|---|---|
| Median length | 7 |
| Mean length | 7.867003367 |
| Min length | 1 |
region
Categorical
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| Iringa | 5294 | 8.9% | |
| Shinyanga | 4982 | 8.4% | |
| Mbeya | 4639 | 7.8% | |
| Kilimanjaro | 4379 | 7.4% | |
| Morogoro | 4006 | 6.7% | |
| Arusha | 3350 | 5.6% | |
| Kagera | 3316 | 5.6% | |
| Mwanza | 3102 | 5.2% | |
| Kigoma | 2816 | 4.7% | |
| Ruvuma | 2640 | 4.4% | |
| Other values (11) | 20876 | 35.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 13 |
|---|---|
| Median length | 6 |
| Mean length | 6.623754209 |
| Min length | 4 |
region_code
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.29700337 |
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 12 |
| Q3 | 17 |
| 95-th percentile | 60 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 17.58740634 |
|---|---|
| Coefficient of variation (CV) | 1.149728866 |
| Kurtosis | 10.28843341 |
| Mean | 15.29700337 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 3.17381811 |
| Sum | 908642 |
| Variance | 309.3168617 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 11 | 5300 | 8.9% | |
| 17 | 5011 | 8.4% | |
| 12 | 4639 | 7.8% | |
| 3 | 4379 | 7.4% | |
| 5 | 4040 | 6.8% | |
| 18 | 3324 | 5.6% | |
| 19 | 3047 | 5.1% | |
| 2 | 3024 | 5.1% | |
| 16 | 2816 | 4.7% | |
| 10 | 2640 | 4.4% | |
| Other values (17) | 21180 | 35.7% |
| Value | Count | Frequency (%) | |
| 1 | 2201 | 3.7% | |
| 2 | 3024 | 5.1% | |
| 3 | 4379 | 7.4% | |
| 4 | 2513 | 4.2% | |
| 5 | 4040 | 6.8% |
| Value | Count | Frequency (%) | |
| 99 | 423 | 0.7% | |
| 90 | 917 | 1.5% | |
| 80 | 1238 | 2.1% | |
| 60 | 1025 | 1.7% | |
| 40 | 1 | < 0.1% |
district_code
Real number (ℝ≥0)
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.629747475 |
| Minimum | 0 |
| Maximum | 80 |
| Zeros | 23 |
| Zeros (%) | < 0.1% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 80 |
| Range | 80 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 9.633648629 |
|---|---|
| Coefficient of variation (CV) | 1.711204396 |
| Kurtosis | 16.21428363 |
| Mean | 5.629747475 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.962045299 |
| Sum | 334407 |
| Variance | 92.80718592 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 12203 | 20.5% | |
| 2 | 11173 | 18.8% | |
| 3 | 9998 | 16.8% | |
| 4 | 8999 | 15.1% | |
| 5 | 4356 | 7.3% | |
| 6 | 4074 | 6.9% | |
| 7 | 3343 | 5.6% | |
| 8 | 1043 | 1.8% | |
| 30 | 995 | 1.7% | |
| 33 | 874 | 1.5% | |
| Other values (10) | 2342 | 3.9% |
| Value | Count | Frequency (%) | |
| 0 | 23 | < 0.1% | |
| 1 | 12203 | 20.5% | |
| 2 | 11173 | 18.8% | |
| 3 | 9998 | 16.8% | |
| 4 | 8999 | 15.1% |
| Value | Count | Frequency (%) | |
| 80 | 12 | < 0.1% | |
| 67 | 6 | < 0.1% | |
| 63 | 195 | 0.3% | |
| 62 | 109 | 0.2% | |
| 60 | 63 | 0.1% |
lga
Categorical
| Distinct | 125 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| Njombe | 2503 | 4.2% | |
| Arusha Rural | 1252 | 2.1% | |
| Moshi Rural | 1251 | 2.1% | |
| Bariadi | 1177 | 2.0% | |
| Rungwe | 1106 | 1.9% | |
| Kilosa | 1094 | 1.8% | |
| Kasulu | 1047 | 1.8% | |
| Mbozi | 1034 | 1.7% | |
| Meru | 1009 | 1.7% | |
| Bagamoyo | 997 | 1.7% | |
| Other values (115) | 46930 | 79.0% |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 16 |
|---|---|
| Median length | 6 |
| Mean length | 7.416885522 |
| Min length | 3 |
ward
Categorical
| Distinct | 2092 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| Igosi | 307 | 0.5% | |
| Imalinyi | 252 | 0.4% | |
| Siha Kati | 232 | 0.4% | |
| Mdandu | 231 | 0.4% | |
| Nduruma | 217 | 0.4% | |
| Kitunda | 203 | 0.3% | |
| Mishamo | 203 | 0.3% | |
| Msindo | 201 | 0.3% | |
| Chalinze | 196 | 0.3% | |
| Maji ya Chai | 190 | 0.3% | |
| Other values (2082) | 57168 | 96.2% |
Unique
| Unique | 30 |
|---|---|
| Unique (%) | 0.1% |
Length
| Max length | 23 |
|---|---|
| Median length | 7 |
| Mean length | 7.505841751 |
| Min length | 3 |
population
Real number (ℝ≥0)
| Distinct | 1049 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 179.9099832 |
| Minimum | 0 |
| Maximum | 30500 |
| Zeros | 21381 |
| Zeros (%) | 36.0% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 25 |
| Q3 | 215 |
| 95-th percentile | 680 |
| Maximum | 30500 |
| Range | 30500 |
| Interquartile range (IQR) | 215 |
Descriptive statistics
| Standard deviation | 471.4821757 |
|---|---|
| Coefficient of variation (CV) | 2.620655994 |
| Kurtosis | 402.2801153 |
| Mean | 179.9099832 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 12.66071359 |
| Sum | 10686653 |
| Variance | 222295.442 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 21381 | 36.0% | |
| 1 | 7025 | 11.8% | |
| 200 | 1940 | 3.3% | |
| 150 | 1892 | 3.2% | |
| 250 | 1681 | 2.8% | |
| 300 | 1476 | 2.5% | |
| 100 | 1146 | 1.9% | |
| 50 | 1139 | 1.9% | |
| 500 | 1009 | 1.7% | |
| 350 | 986 | 1.7% | |
| Other values (1039) | 19725 | 33.2% |
| Value | Count | Frequency (%) | |
| 0 | 21381 | 36.0% | |
| 1 | 7025 | 11.8% | |
| 2 | 4 | < 0.1% | |
| 3 | 4 | < 0.1% | |
| 4 | 13 | < 0.1% |
| Value | Count | Frequency (%) | |
| 30500 | 1 | < 0.1% | |
| 15300 | 1 | < 0.1% | |
| 11463 | 1 | < 0.1% | |
| 10000 | 3 | < 0.1% | |
| 9865 | 1 | < 0.1% |
public_meeting
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3334 |
| Missing (%) | 5.6% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| True | 51011 | 85.9% | |
| False | 5055 | 8.5% | |
| (Missing) | 3334 | 5.6% |
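The percentages in this table are computed over all 59400 rows, missing values included. If the missing rows are set aside, the share of `True` among known values is somewhat higher. A quick check with the counts above:

```python
# Counts copied from the frequency table above.
n_true, n_false, n_missing = 51011, 5055, 3334

n_total = n_true + n_false + n_missing
n_known = n_true + n_false

true_share_all = 100 * n_true / n_total    # denominator includes missing rows
true_share_known = 100 * n_true / n_known  # denominator excludes missing rows

print(f"{true_share_all:.1f}% of all rows, {true_share_known:.1f}% of known values")
```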
recorded_by
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| GeoData Consultants Ltd | 59400 | 100.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
scheme_management
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3877 |
| Missing (%) | 6.5% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| VWC | 36793 | 61.9% | |
| WUG | 5206 | 8.8% | |
| Water authority | 3153 | 5.3% | |
| WUA | 2883 | 4.9% | |
| Water Board | 2748 | 4.6% | |
| Parastatal | 1680 | 2.8% | |
| Private operator | 1063 | 1.8% | |
| Company | 1061 | 1.8% | |
| Other | 766 | 1.3% | |
| SWC | 97 | 0.2% | |
| Other values (2) | 73 | 0.1% | |
| (Missing) | 3877 | 6.5% |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.537373737 |
| Min length | 3 |
scheme_name
Categorical
| Distinct | 2696 |
|---|---|
| Distinct (%) | 8.6% |
| Missing | 28166 |
| Missing (%) | 47.4% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| K | 682 | 1.1% | |
| None | 644 | 1.1% | |
| Borehole | 546 | 0.9% | |
| Chalinze wate | 405 | 0.7% | |
| M | 400 | 0.7% | |
| DANIDA | 379 | 0.6% | |
| Government | 320 | 0.5% | |
| Ngana water supplied scheme | 270 | 0.5% | |
| wanging'ombe water supply s | 261 | 0.4% | |
| wanging'ombe supply scheme | 234 | 0.4% | |
| Other values (2686) | 27093 | 45.6% | |
| (Missing) | 28166 | 47.4% |
Unique
| Unique | 712 |
|---|---|
| Unique (%) | 2.3% |
Length
| Max length | 46 |
|---|---|
| Median length | 3 |
| Mean length | 8.94456229 |
| Min length | 1 |
permit
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3056 |
| Missing (%) | 5.1% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| True | 38852 | 65.4% | |
| False | 17492 | 29.4% | |
| (Missing) | 3056 | 5.1% |
construction_year
Real number (ℝ≥0)
| Distinct | 55 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1300.652475 |
| Minimum | 0 |
| Maximum | 2013 |
| Zeros | 20709 |
| Zeros (%) | 34.9% |
| Memory size | 464.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1986 |
| Q3 | 2004 |
| 95-th percentile | 2010 |
| Maximum | 2013 |
| Range | 2013 |
| Interquartile range (IQR) | 2004 |
Descriptive statistics
| Standard deviation | 951.6205473 |
|---|---|
| Coefficient of variation (CV) | 0.7316485885 |
| Kurtosis | -1.596432369 |
| Mean | 1300.652475 |
| Median Absolute Deviation (MAD) | 22 |
| Skewness | -0.6349277866 |
| Sum | 77258757 |
| Variance | 905581.6661 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 20709 | 34.9% | |
| 2010 | 2645 | 4.5% | |
| 2008 | 2613 | 4.4% | |
| 2009 | 2533 | 4.3% | |
| 2000 | 2091 | 3.5% | |
| 2007 | 1587 | 2.7% | |
| 2006 | 1471 | 2.5% | |
| 2003 | 1286 | 2.2% | |
| 2011 | 1256 | 2.1% | |
| 2004 | 1123 | 1.9% | |
| Other values (45) | 22086 | 37.2% |
| Value | Count | Frequency (%) | |
| 0 | 20709 | 34.9% | |
| 1960 | 102 | 0.2% | |
| 1961 | 21 | < 0.1% | |
| 1962 | 30 | 0.1% | |
| 1963 | 85 | 0.1% |
| Value | Count | Frequency (%) | |
| 2013 | 176 | 0.3% | |
| 2012 | 1084 | 1.8% | |
| 2011 | 1256 | 2.1% | |
| 2010 | 2645 | 4.5% | |
| 2009 | 2533 | 4.3% |
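The non-zero values of this variable range from 1960 to 2013, so the 20709 zeros look like a stand-in for "year unknown" rather than real years. That reading is an assumption, but under it the reported mean of about 1300 is not meaningful. Because zeros add nothing to the sum, the mean over non-zero rows can be recovered from the figures above:

```python
# Figures copied from the tables above.
n_rows = 59400
n_zeros = 20709
total = 77258757  # reported "Sum"

mean_all = total / n_rows
n_nonzero = n_rows - n_zeros
mean_nonzero = total / n_nonzero  # zeros contribute nothing to the sum

print(f"mean over all rows:      {mean_all:.2f}")
print(f"mean over non-zero rows: {mean_nonzero:.2f}")
```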
extraction_type
Categorical
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| gravity | 26780 | 45.1% | |
| nira/tanira | 8154 | 13.7% | |
| other | 6430 | 10.8% | |
| submersible | 4764 | 8.0% | |
| swn 80 | 3670 | 6.2% | |
| mono | 2865 | 4.8% | |
| india mark ii | 2400 | 4.0% | |
| afridev | 1770 | 3.0% | |
| ksb | 1415 | 2.4% | |
| other - rope pump | 451 | 0.8% | |
| Other values (8) | 701 | 1.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 25 |
|---|---|
| Median length | 7 |
| Mean length | 7.719511785 |
| Min length | 3 |
extraction_type_group
Categorical
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| gravity | 26780 | 45.1% | |
| nira/tanira | 8154 | 13.7% | |
| other | 6430 | 10.8% | |
| submersible | 6179 | 10.4% | |
| swn 80 | 3670 | 6.2% | |
| mono | 2865 | 4.8% | |
| india mark ii | 2400 | 4.0% | |
| afridev | 1770 | 3.0% | |
| rope pump | 451 | 0.8% | |
| other handpump | 364 | 0.6% | |
| Other values (3) | 337 | 0.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 15 |
|---|---|
| Median length | 7 |
| Mean length | 7.880538721 |
| Min length | 4 |
extraction_type_class
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| gravity | 26780 | 45.1% | |
| handpump | 16456 | 27.7% | |
| other | 6430 | 10.8% | |
| submersible | 6179 | 10.4% | |
| motorpump | 2987 | 5.0% | |
| rope pump | 451 | 0.8% | |
| wind-powered | 117 | 0.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 12 |
|---|---|
| Median length | 7 |
| Mean length | 7.602239057 |
| Min length | 5 |
management
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| vwc | 40507 | 68.2% | |
| wug | 6515 | 11.0% | |
| water board | 2933 | 4.9% | |
| wua | 2535 | 4.3% | |
| private operator | 1971 | 3.3% | |
| parastatal | 1768 | 3.0% | |
| water authority | 904 | 1.5% | |
| other | 844 | 1.4% | |
| company | 685 | 1.2% | |
| unknown | 561 | 0.9% | |
| Other values (2) | 177 | 0.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.350639731 |
| Min length | 3 |
management_group
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| user-group | 52490 | 88.4% | |
| commercial | 3638 | 6.1% | |
| parastatal | 1768 | 3.0% | |
| other | 943 | 1.6% | |
| unknown | 561 | 0.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.892289562 |
| Min length | 5 |
payment
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| never pay | 25348 | 42.7% | |
| pay per bucket | 8985 | 15.1% | |
| pay monthly | 8300 | 14.0% | |
| unknown | 8157 | 13.7% | |
| pay when scheme fails | 3914 | 6.6% | |
| pay annually | 3642 | 6.1% | |
| other | 1054 | 1.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 21 |
|---|---|
| Median length | 9 |
| Mean length | 10.66479798 |
| Min length | 5 |
payment_type
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |
| Value | Count | Frequency (%) | |
| never pay | 25348 | 42.7% | |
| per bucket | 8985 | 15.1% | |
| monthly | 8300 | 14.0% | |
| unknown | 8157 | 13.7% | |
| on failure | 3914 | 6.6% | |
| annually | 3642 | 6.1% | |
| other | 1054 | 1.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.530757576 |
| Min length | 5 |
water_quality
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| soft | 50818 | 85.6% |
| salty | 4856 | 8.2% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| coloured | 490 | 0.8% |
| salty abandoned | 339 | 0.6% |
| fluoride | 200 | 0.3% |
| fluoride abandoned | 17 | < 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 18 |
|---|---|
| Median length | 4 |
| Mean length | 4.303282828 |
| Min length | 4 |
quality_group
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| good | 50818 | 85.6% |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| colored | 490 | 0.8% |
| fluoride | 217 | 0.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 4.23510101 |
| Min length | 4 |
quantity
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.362373737 |
| Min length | 3 |
quantity_group
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.362373737 |
| Min length | 3 |
source
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| spring | 17021 | 28.7% |
| shallow well | 16824 | 28.3% |
| machine dbh | 11075 | 18.6% |
| river | 9612 | 16.2% |
| rainwater harvesting | 2295 | 3.9% |
| hand dtw | 874 | 1.5% |
| lake | 765 | 1.3% |
| dam | 656 | 1.1% |
| other | 212 | 0.4% |
| unknown | 66 | 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 20 |
|---|---|
| Median length | 11 |
| Mean length | 8.978804714 |
| Min length | 3 |
source_type
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| spring | 17021 | 28.7% |
| shallow well | 16824 | 28.3% |
| borehole | 11949 | 20.1% |
| river/lake | 10377 | 17.5% |
| rainwater harvesting | 2295 | 3.9% |
| dam | 656 | 1.1% |
| other | 278 | 0.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 20 |
|---|---|
| Median length | 8 |
| Mean length | 9.303602694 |
| Min length | 3 |
source_class
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| groundwater | 45794 | 77.1% |
| surface | 13328 | 22.4% |
| unknown | 278 | 0.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.08377104 |
| Min length | 7 |
waterpoint_type
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| communal standpipe | 28522 | 48.0% |
| hand pump | 17488 | 29.4% |
| other | 6380 | 10.7% |
| communal standpipe multiple | 6103 | 10.3% |
| improved spring | 784 | 1.3% |
| cattle trough | 116 | 0.2% |
| dam | 7 | < 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 27 |
|---|---|
| Median length | 18 |
| Mean length | 14.82757576 |
| Min length | 3 |
waterpoint_type_group
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| communal standpipe | 34625 | 58.3% |
| hand pump | 17488 | 29.4% |
| other | 6380 | 10.7% |
| improved spring | 784 | 1.3% |
| cattle trough | 116 | 0.2% |
| dam | 7 | < 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 13.90287879 |
| Min length | 3 |
status_group
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.1 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| functional | 32259 | 54.3% |
| non functional | 22824 | 38.4% |
| functional needs repair | 4317 | 7.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 12.48176768 |
| Min length | 10 |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
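As a rough illustration of the calculation described above, the following sketch computes r in plain Python and checks the invariance under a change of location and scale. The function name `pearson_r` and the sample values are made up for this example; they are not part of the report or taken from the dataset.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors in the covariance and in the standard deviations cancel,
    # so sums of centered products are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up values, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10.0 * a + 3.0 for a in x], y)
print(r, r_rescaled)
```

A negative scale factor would flip the sign of r while keeping its magnitude.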
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
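The rank-based calculation can be sketched as follows, assuming ties receive the average of the ranks they span (a common convention, not something the report states). The helper names and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up monotonic but nonlinear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this made-up cubic relationship the ranks of x and y coincide, so ρ reaches +1 while Pearson's r stays below 1.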
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
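The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch, with a hypothetical function name and made-up inputs, could look like this; library implementations often use a tie-adjusted variant such as tau-b instead, so results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts as neither.
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Made-up values: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```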
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 69572 | 6000.0 | 2011-03-14 | Roman | 1390 | Roman | 34.938093 | -9.856322 | none | 0 | Lake Nyasa | Mnyusi B | Iringa | 11 | 5 | Ludewa | Mundindi | 109 | True | GeoData Consultants Ltd | VWC | Roman | False | 1999 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional |
| 1 | 8776 | 0.0 | 2013-03-06 | Grumeti | 1399 | GRUMETI | 34.698766 | -2.147466 | Zahanati | 0 | Lake Victoria | Nyamara | Mara | 20 | 2 | Serengeti | Natta | 280 | NaN | GeoData Consultants Ltd | Other | NaN | True | 2010 | gravity | gravity | gravity | wug | user-group | never pay | never pay | soft | good | insufficient | insufficient | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional |
| 2 | 34310 | 25.0 | 2013-02-25 | Lottery Club | 686 | World vision | 37.460664 | -3.821329 | Kwa Mahundi | 0 | Pangani | Majengo | Manyara | 21 | 4 | Simanjiro | Ngorika | 250 | True | GeoData Consultants Ltd | VWC | Nyumba ya mungu pipe scheme | True | 2009 | gravity | gravity | gravity | vwc | user-group | pay per bucket | per bucket | soft | good | enough | enough | dam | dam | surface | communal standpipe multiple | communal standpipe | functional |
| 3 | 67743 | 0.0 | 2013-01-28 | Unicef | 263 | UNICEF | 38.486161 | -11.155298 | Zahanati Ya Nanyumbu | 0 | Ruvuma / Southern Coast | Mahakamani | Mtwara | 90 | 63 | Nanyumbu | Nanyumbu | 58 | True | GeoData Consultants Ltd | VWC | NaN | True | 1986 | submersible | submersible | submersible | vwc | user-group | never pay | never pay | soft | good | dry | dry | machine dbh | borehole | groundwater | communal standpipe multiple | communal standpipe | non functional |
| 4 | 19728 | 0.0 | 2011-07-13 | Action In A | 0 | Artisan | 31.130847 | -1.825359 | Shuleni | 0 | Lake Victoria | Kyanyamisa | Kagera | 18 | 1 | Karagwe | Nyakasimbi | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | gravity | gravity | gravity | other | other | never pay | never pay | soft | good | seasonal | seasonal | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional |
| 5 | 9944 | 20.0 | 2011-03-13 | Mkinga Distric Coun | 0 | DWE | 39.172796 | -4.765587 | Tajiri | 0 | Pangani | Moa/Mwereme | Tanga | 4 | 8 | Mkinga | Moa | 1 | True | GeoData Consultants Ltd | VWC | Zingibali | True | 2009 | submersible | submersible | submersible | vwc | user-group | pay per bucket | per bucket | salty | salty | enough | enough | other | other | unknown | communal standpipe multiple | communal standpipe | functional |
| 6 | 19816 | 0.0 | 2012-10-01 | Dwsp | 0 | DWSP | 33.362410 | -3.766365 | Kwa Ngomho | 0 | Internal | Ishinabulandi | Shinyanga | 17 | 3 | Shinyanga Rural | Samuye | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | non functional |
| 7 | 54551 | 0.0 | 2012-10-09 | Rwssp | 0 | DWE | 32.620617 | -4.226198 | Tushirikiane | 0 | Lake Tanganyika | Nyawishi Center | Shinyanga | 17 | 3 | Kahama | Chambo | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | wug | user-group | unknown | unknown | milky | milky | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | non functional |
| 8 | 53934 | 0.0 | 2012-11-03 | Wateraid | 0 | Water Aid | 32.711100 | -5.146712 | Kwa Ramadhan Musa | 0 | Lake Tanganyika | Imalauduki | Tabora | 14 | 6 | Tabora Urban | Itetemia | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | india mark ii | india mark ii | handpump | vwc | user-group | never pay | never pay | salty | salty | seasonal | seasonal | machine dbh | borehole | groundwater | hand pump | hand pump | non functional |
| 9 | 46144 | 0.0 | 2011-08-03 | Isingiro Ho | 0 | Artisan | 30.626991 | -1.257051 | Kwapeto | 0 | Lake Victoria | Mkonomre | Kagera | 18 | 1 | Karagwe | Kaisho | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional |
Last rows
| | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 59390 | 13677 | 0.0 | 2011-08-04 | Rudep | 1715 | DWE | 31.370848 | -8.258160 | Kwa Mzee Atanas | 0 | Lake Tanganyika | Kitonto | Rukwa | 15 | 2 | Sumbawanga Rural | Mkowe | 150 | True | GeoData Consultants Ltd | VWC | NaN | False | 1991 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | machine dbh | borehole | groundwater | hand pump | hand pump | functional |
| 59391 | 44885 | 0.0 | 2013-08-03 | Government Of Tanzania | 540 | Government | 38.044070 | -4.272218 | Kwa | 0 | Pangani | Maore Kati | Kilimanjaro | 3 | 3 | Same | Maore | 210 | True | GeoData Consultants Ltd | Water authority | Hingilili | True | 1967 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | non functional |
| 59392 | 40607 | 0.0 | 2011-04-15 | Government Of Tanzania | 0 | Government | 33.009440 | -8.520888 | Benard Charles | 0 | Lake Rukwa | Mbuyuni A | Mbeya | 12 | 1 | Chunya | Mbuyuni | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | non functional |
| 59393 | 48348 | 0.0 | 2012-10-27 | Private | 0 | Private | 33.866852 | -4.287410 | Kwa Peter | 0 | Internal | Masanga | Tabora | 14 | 2 | Igunga | Igunga | 0 | False | GeoData Consultants Ltd | Water authority | NaN | False | 0 | gravity | gravity | gravity | private operator | commercial | pay per bucket | per bucket | soft | good | insufficient | insufficient | dam | dam | surface | other | other | functional |
| 59394 | 11164 | 500.0 | 2011-03-09 | World Bank | 351 | ML appro | 37.634053 | -6.124830 | Chimeredya | 0 | Wami / Ruvu | Komstari | Morogoro | 5 | 6 | Mvomero | Diongoya | 89 | True | GeoData Consultants Ltd | VWC | NaN | True | 2007 | submersible | submersible | submersible | vwc | user-group | pay monthly | monthly | soft | good | enough | enough | machine dbh | borehole | groundwater | communal standpipe | communal standpipe | non functional |
| 59395 | 60739 | 10.0 | 2013-05-03 | Germany Republi | 1210 | CES | 37.169807 | -3.253847 | Area Three Namba 27 | 0 | Pangani | Kiduruni | Kilimanjaro | 3 | 5 | Hai | Masama Magharibi | 125 | True | GeoData Consultants Ltd | Water Board | Losaa Kia water supply | True | 1999 | gravity | gravity | gravity | water board | user-group | pay per bucket | per bucket | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional |
| 59396 | 27263 | 4700.0 | 2011-05-07 | Cefa-njombe | 1212 | Cefa | 35.249991 | -9.070629 | Kwa Yahona Kuvala | 0 | Rufiji | Igumbilo | Iringa | 11 | 4 | Njombe | Ikondo | 56 | True | GeoData Consultants Ltd | VWC | Ikondo electrical water sch | True | 1996 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | functional |
| 59397 | 37057 | 0.0 | 2011-04-11 | NaN | 0 | NaN | 34.017087 | -8.750434 | Mashine | 0 | Rufiji | Madungulu | Mbeya | 12 | 7 | Mbarali | Chimala | 0 | True | GeoData Consultants Ltd | VWC | NaN | False | 0 | swn 80 | swn 80 | handpump | vwc | user-group | pay monthly | monthly | fluoride | fluoride | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | functional |
| 59398 | 31282 | 0.0 | 2011-03-08 | Malec | 0 | Musa | 35.861315 | -6.378573 | Mshoro | 0 | Rufiji | Mwinyi | Dodoma | 1 | 4 | Chamwino | Mvumi Makulu | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | shallow well | shallow well | groundwater | hand pump | hand pump | functional |
| 59399 | 26348 | 0.0 | 2011-03-23 | World Bank | 191 | World | 38.104048 | -6.747464 | Kwa Mzee Lugawa | 0 | Wami / Ruvu | Kikatanyemba | Morogoro | 5 | 2 | Morogoro Rural | Ngerengere | 150 | True | GeoData Consultants Ltd | VWC | NaN | True | 2002 | nira/tanira | nira/tanira | handpump | vwc | user-group | pay when scheme fails | on failure | salty | salty | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional |